Accelerating parallel searching similar multiple patterns from data streams by using MapReduce
FU Chen, ZHONG Cheng, YE Bo
Journal of Computer Applications    2017, 37 (1): 37-41.   DOI: 10.11772/j.issn.1001-9081.2017.01.0037
An effective storage scheme for time series was designed on the Hadoop Distributed File System (HDFS), the sub-series were distributed to the compute nodes of the Hadoop cluster with the Distributed Cache tool, and the matrix of dynamic time warping (DTW) distances was partitioned into sub-matrices. Based on the MapReduce programming model, the DTW distances were computed in parallel by iteratively processing the sub-matrices on each anti-diagonal, and an efficient parallel algorithm for searching similar patterns in data streams was developed by improving the pruning of redundant computation. Experimental results on a long time series data set of snow depth in China show that when each time series has length 5000 or more, parallel computation of the DTW distances takes less time than the corresponding sequential computation; when the length is 9000 or more, the parallel computation time decreases as more compute nodes are used; furthermore, when each pattern has length 4000 or more and at least 5 compute nodes are used, parallel search for similar sub-series in data streams takes 20% of the corresponding sequential search time.
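The anti-diagonal ordering described in this abstract can be illustrated with a minimal single-process sketch. It is not the authors' MapReduce implementation: each cell here stands in for one sub-matrix, the local cost (absolute difference) is an assumption, and the function names are hypothetical. The point is only that every cell on anti-diagonal i + j = k depends solely on diagonals k - 1 and k - 2, so all cells of one diagonal could be computed concurrently.

```python
def dtw_sequential(a, b):
    """Classic row-by-row DTW; local cost is |x - y| (an assumption)."""
    n, m = len(a), len(b)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def dtw_wavefront(a, b):
    """Same recurrence, filled one anti-diagonal (i + j = k) at a time.

    Cells on the same anti-diagonal are independent of each other, which is
    what makes a parallel (e.g. MapReduce) evaluation per diagonal possible.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for k in range(2, n + m + 1):
        # Valid rows on this diagonal: 1 <= i <= n and 1 <= k - i <= m.
        for i in range(max(1, k - m), min(n, k - 1) + 1):
            j = k - i
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]
```

Both functions evaluate the same recurrence, so they return identical distances; only the traversal order differs.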
Improved global parameterization method
HONG Cheng, ZHANG Dengyi, SU Kehua, WU Xiaoping, ZHENG Changjin
Journal of Computer Applications    2016, 36 (9): 2584-2589.   DOI: 10.11772/j.issn.1001-9081.2016.09.2584
To address the large distortion and high computational complexity of parameterizing surfaces of non-zero genus, an improved global parameterization approach based on holomorphic 1-forms was proposed, which starts from the gradient field and adopts a simpler and faster method to compute the homology and cohomology groups. Firstly, a simplified cut graph method was used to construct the homology group and determine the topology. Secondly, the cohomology group of the linear space formed by the gradient field was calculated by defining a special harmonic function, which yields the closed 1-forms. Thirdly, the homology group was diffused to harmonic 1-forms by minimizing the harmonic energy. Finally, the holomorphic 1-forms were computed as linear combinations of the harmonic 1-forms, and the parameterization was obtained by integrating a holomorphic 1-form over the fundamental domain of the surface. Theoretical analysis of the homology and cohomology groups shows that the parameterization is a global, border-free conformal mapping. Experimental results on models of non-zero genus show that, compared with the previous global parameterization method based on holomorphic 1-forms, the proposed algorithm gives better visual quality, smaller average error, and higher computational efficiency.
Parallel solving shortest common superstring using genetic algorithm and ant colony optimization
WU Shigang, ZHONG Cheng
Journal of Computer Applications    2014, 34 (7): 1857-1861.   DOI: 10.11772/j.issn.1001-9081.2014.07.1857
According to the capacity of each cache level, the population individuals and ant data in CPU main memory were assigned to the L3, L2 and L1 caches to reduce the data transfer overhead among caches during parallel computing. Asynchronous, incomplete data transfer was performed between the CPU and the GPU, and multiple streams were executed asynchronously by multiple GPU kernel functions. The number of threads per GPU block was set to a multiple of 16, and GPU shared memory was divided into banks whose size is a multiple of 32. GPU constant memory was used to store frequently read, read-only parameters such as the crossover probability and the mutation probability. Large read-only data structures such as the string set and the overlap matrix were bound to GPU texture memory. On this basis, a computation-, cache- and communication-efficient parallel algorithm in which the CPU and GPU cooperate to solve the shortest common superstring problem was designed and implemented. Experimental results on shortest common superstring instances of several sizes show that the proposed CPU and GPU parallel algorithm is more than 70 times faster than the sequential algorithm.
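The overlap matrix mentioned in this abstract is the read-only structure that a genetic algorithm or ant colony search would consult when scoring a candidate ordering of the strings. The following is a minimal CPU-only sketch of that idea, not the authors' GPU code: the function names are hypothetical, and it assumes no input string is a substring of another.

```python
def overlap(s, t):
    """Length of the longest suffix of s that is also a prefix of t."""
    for k in range(min(len(s), len(t)), 0, -1):
        if s.endswith(t[:k]):
            return k
    return 0


def overlap_matrix(strings):
    """Pairwise overlaps; the diagonal is 0 because a string never follows itself."""
    n = len(strings)
    return [[0 if i == j else overlap(strings[i], strings[j])
             for j in range(n)] for i in range(n)]


def superstring(strings, order, matrix):
    """Merge the strings in the given order, sharing each pairwise overlap."""
    result = strings[order[0]]
    for prev, cur in zip(order, order[1:]):
        result += strings[cur][matrix[prev][cur]:]
    return result


def superstring_length(strings, order, matrix):
    """Candidate cost: total length minus the overlaps along the order."""
    total = sum(len(strings[i]) for i in order)
    saved = sum(matrix[p][c] for p, c in zip(order, order[1:]))
    return total - saved
```

For example, with the strings `abc`, `bcd` and `cde` taken in that order, the merged superstring is `abcde` and its length is 5. A search procedure would try to find the ordering that minimizes `superstring_length`.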

One-site multi-table and cross multi-table frequent item sets mining with privacy preserving
LIN Rui, ZHONG Cheng, HUA Pei
Journal of Computer Applications    2013, 33 (12): 3437-3440.  
To ensure that personal and original information is not disclosed among the parties when several parties cooperatively mine data tables held at different computational sites, a privacy-preserving three-site cross multi-table frequent item set mining algorithm, based on a secure three-party protocol, was proposed for a distributed environment with multiple tables at each site. The proposed algorithm perturbed the data with generated random numbers, mined inter-site frequent item sets in parallel, joined records with equal values on the common join attribute of the tables across sites, and applied the secure protocol to compute the global support of inter-site cross-table frequent item sets. The experimental results show that the proposed algorithm is efficient, and that it can mine cross multi-table frequent item sets while preserving the private data at each site.
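How random numbers can hide local counts while a global support is still computed can be illustrated with the classic ring-based secure sum. This is a swapped-in textbook technique offered as a sketch, not the secure three-party protocol of the paper; the modulus, the function names, and the use of an absolute minimum count are all assumptions.

```python
import secrets

# Hypothetical modulus; it only needs to exceed any possible global count.
MODULUS = 2**61 - 1


def secure_sum(local_counts, modulus=MODULUS):
    """Ring secure sum simulated in one process.

    The first site adds a secret random mask to its count before passing the
    running total on, so no later site sees a true partial sum. When the
    total returns, the first site removes the mask.
    """
    mask = secrets.randbelow(modulus)
    running = (mask + local_counts[0]) % modulus
    for count in local_counts[1:]:
        running = (running + count) % modulus  # each site adds its own count
    return (running - mask) % modulus


def is_globally_frequent(local_counts, min_count):
    """An item set is frequent if its summed support reaches min_count."""
    return secure_sum(local_counts) >= min_count
```

With local supports 12, 7 and 30 at three sites, the global support is 49, so the item set passes a threshold of 40 but not one of 50. The sketch assumes honest sites that do not collude.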
Communication-efficient parallel sorting integers sequence on multi-core cluster
KE Qi, ZHONG Cheng, CHEN Qingyuan, LU Xiangyan
Journal of Computer Applications    2013, 33 (03): 821-824.   DOI: 10.3724/SP.J.1087.2013.00821
A data distribution strategy and a communication-efficient parallel algorithm for sorting integer sequences were proposed for heterogeneous clusters of multi-core machines. The presented data distribution model used the computation speed, communication rate and memory capacity of each computing node to dynamically compute the size of the data block assigned to that node, so as to balance the load among nodes. In the proposed parallel sorting algorithm, which exploits the characteristics of integer sequences, the master node distributed the data blocks to the slave nodes and received the sorted subsequences in two rounds, each slave node returned its sorted subsequence to the master node by a bucket-packing method, and the master node linked the received sorted subsequences directly into the final sorted sequence by bucket mapping, which reduces data merge operations with large communication cost. Analysis and experimental results on a heterogeneous cluster of multi-core machines show that the presented parallel integer sorting algorithm is efficient and scalable.
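One way to read the distribution and bucket steps of this abstract is sketched below in a single process. It is an interpretation rather than the authors' algorithm: block sizes are made proportional to a single integer speed weight per node (the paper also uses communication rate and memory capacity), "bucket packing" is modeled as (value, count) pairs, and all function names are hypothetical.

```python
from collections import Counter


def block_sizes(total, speeds):
    """Split `total` items among nodes in proportion to integer speed weights."""
    weight = sum(speeds)
    sizes = [total * s // weight for s in speeds]
    sizes[-1] += total - sum(sizes)  # rounding leftover goes to the last node
    return sizes


def split_blocks(data, sizes):
    """Cut the input into consecutive blocks of the given sizes."""
    blocks, start = [], 0
    for size in sizes:
        blocks.append(data[start:start + size])
        start += size
    return blocks


def worker_pack(block):
    """Worker side: sort the block and pack it as (value, count) pairs."""
    return sorted(Counter(block).items())


def master_link(packed_blocks, lo, hi):
    """Master side: map each packed value to its bucket, then read buckets in order.

    Counts are added per integer value, so no comparison-based merge is needed.
    """
    counts = [0] * (hi - lo + 1)
    for packed in packed_blocks:
        for value, count in packed:
            counts[value - lo] += count
    result = []
    for offset, count in enumerate(counts):
        result.extend([lo + offset] * count)
    return result


def distributed_sort(data, speeds):
    """Simulate the master/worker rounds sequentially."""
    if not data:
        return []
    sizes = block_sizes(len(data), speeds)
    packed = [worker_pack(block) for block in split_blocks(data, sizes)]
    return master_link(packed, min(data), max(data))
```

This variant suits integer keys with a moderate value range, since the master allocates one bucket per possible value; a wide range would call for coarser buckets.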